In this supplemental material, we provide additional details on the theory, the algorithms, and the experiments. In Section 2, we continue analyzing the population-level difference between Trojaned and clean models, with a focus on the shortcuts. A boundary operator on a p-simplex takes all its adjacent (p-1)-simplices. In particular, the boundary of an edge consists of its adjacent nodes; the boundary of a triangle consists of its three edges. More generally, the boundary of a p-chain is the formal sum of the boundaries of all its elements, $\partial(c) = \sum_{\sigma \in c} \partial(\sigma)$. After the reduction, the pivoting entries of the reduced matrix correspond to pairs of simplices.
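For concreteness, the boundary operator and the Z/2 cancellation behind chain boundaries can be sketched in a few lines of Python (simplices represented as sorted vertex tuples; this is an illustrative sketch, not the paper's implementation):

```python
from itertools import combinations

def boundary(simplex):
    """All (p-1)-faces of a p-simplex given as a sorted vertex tuple."""
    return list(combinations(simplex, len(simplex) - 1))

def chain_boundary(chain):
    """Boundary of a p-chain over Z/2: faces occurring an even number
    of times cancel, so only odd-multiplicity faces survive."""
    parity = {}
    for simplex in chain:
        for face in boundary(simplex):
            parity[face] = parity.get(face, 0) ^ 1
    return {face for face, odd in parity.items() if odd}

# The boundary of an edge is its two endpoints; of a triangle, its three edges.
edge_faces = boundary((0, 1))          # [(0,), (1,)]
triangle_faces = boundary((0, 1, 2))   # [(0, 1), (0, 2), (1, 2)]
```

Note that the boundary of the three-edge cycle {(0,1), (1,2), (0,2)} is empty, since every vertex occurs twice and cancels over Z/2.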
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
- North America > Canada > British Columbia > Vancouver (0.05)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Wisconsin (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (3 more...)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > New York > Suffolk County > Stony Brook (0.04)
Neural Bayes inference for complex bivariate extremal dependence models
André, Lídia M., Wadsworth, Jennifer L., Huser, Raphaël
Likelihood-free approaches are appealing for performing inference on complex dependence models, either because it is not possible to formulate a likelihood function, or because its evaluation is very computationally costly. This is the case for several models available in the multivariate extremes literature, particularly for the most flexible tail models, including those that interpolate between the two key dependence classes of `asymptotic dependence' and `asymptotic independence'. We focus on approaches that leverage neural networks to approximate Bayes estimators. In particular, we explore the properties of neural Bayes estimators for parameter inference in several flexible models that are computationally expensive to fit, with a view to aiding their routine implementation. Owing to the absence of likelihood evaluation in the inference procedure, classical information criteria such as the Bayesian information criterion cannot be used to select the most appropriate model. Instead, we propose using neural networks as neural Bayes classifiers for model selection. Our goal is to provide a toolbox for simple, fast fitting and comparison of complex extreme-value dependence models, where the best model is selected for a given data set and its parameters subsequently estimated using neural Bayes estimation. We apply our classifiers and estimators to analyse the pairwise extremal behaviour of changes in horizontal geomagnetic field fluctuations at three different locations.
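The core recipe behind a neural Bayes estimator — train a network on simulated (parameter, data) pairs to minimize an average loss, so that under squared-error loss it approximates the posterior mean — can be sketched on a deliberately trivial model. Everything below (the Gaussian location toy model, the architecture, the training choices) is illustrative, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an intractable model: estimate theta from n i.i.d.
# Normal(theta, 1) draws, with prior theta ~ Uniform(0, 1).
n, hidden, lr, steps, batch = 10, 32, 1e-2, 2000, 64

W1 = rng.normal(0, 0.1, (n, hidden)); b1 = np.zeros(hidden)
W2 = rng.normal(0, 0.1, (hidden, 1)); b2 = np.zeros(1)

def forward(Z):
    H = np.tanh(Z @ W1 + b1)
    return H, H @ W2 + b2

for _ in range(steps):
    theta = rng.uniform(0, 1, (batch, 1))        # parameters from the prior
    Z = theta + rng.normal(0, 1, (batch, n))     # data simulated given theta
    H, pred = forward(Z)
    err = pred - theta                           # squared-error (L2) Bayes loss
    # Manual backpropagation through the one-hidden-layer network.
    gW2 = H.T @ err / batch; gb2 = err.mean(0)
    dH = (err @ W2.T) * (1 - H**2)               # tanh derivative
    gW1 = Z.T @ dH / batch; gb1 = dH.mean(0)
    W2 -= lr * gW2; b2 -= lr * gb2; W1 -= lr * gW1; b1 -= lr * gb1

# Under L2 loss, the trained network approximates the posterior mean,
# amortized over all data sets the prior can generate.
_, estimate = forward(0.5 + rng.normal(0, 1, (1, n)))
```

The key design point is that no likelihood is ever evaluated: only the ability to simulate from the model is required, which is exactly what makes the approach attractive for expensive extremal dependence models.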
- Europe > United Kingdom (0.14)
- Europe > Belgium > Wallonia > Namur Province > Namur (0.04)
- North America > Greenland (0.04)
- (4 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
AutoPatent: A Multi-Agent Framework for Automatic Patent Generation
Wang, Qiyao, Ni, Shiwen, Liu, Huaren, Lu, Shule, Chen, Guhong, Feng, Xi, Wei, Chi, Qu, Qiang, Alinejad-Rokny, Hamid, Lin, Yuan, Yang, Min
As the capabilities of Large Language Models (LLMs) continue to advance, the field of patent processing has garnered increased attention within the natural language processing community. However, the majority of research has been concentrated on classification tasks, such as patent categorization and examination, or on short text generation tasks like patent summarization and patent quizzes. In this paper, we introduce a novel and practical task known as Draft2Patent, along with its corresponding D2P benchmark, which challenges LLMs to generate full-length patents averaging 17K tokens based on initial drafts. Patents present a significant challenge to LLMs due to their specialized nature, standardized terminology, and extensive length. We propose a multi-agent framework called AutoPatent which leverages the LLM-based planner agent, writer agents, and examiner agent with PGTree and RRAG to generate lengthy, intricate, and high-quality complete patent documents. The experimental results demonstrate that our AutoPatent framework significantly enhances the ability to generate comprehensive patents across various LLMs. Furthermore, we have discovered that patents generated solely with the AutoPatent framework based on the Qwen2.5-7B model outperform those produced by larger and more powerful LLMs, such as GPT-4o, Qwen2.5-72B, and LLAMA3.1-70B, in both objective metrics and human evaluations. We will make the data and code available upon acceptance at \url{https://github.com/QiYao-Wang/AutoPatent}.
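At a very high level, the planner-writer-examiner division of labour the abstract describes can be caricatured as a plain-Python pipeline. Every agent body below is a hypothetical stand-in (a real system would call an LLM inside each method, and the paper's PGTree and RRAG components are not modelled here):

```python
from dataclasses import dataclass

@dataclass
class Planner:
    """Hypothetical stand-in: decomposes a draft into writing tasks."""
    def plan(self, draft: str) -> list:
        return [s.strip() for s in draft.split(".") if s.strip()]

@dataclass
class Writer:
    """Hypothetical stand-in: expands one task into a section."""
    def write(self, task: str) -> str:
        return f"Section on: {task}."

@dataclass
class Examiner:
    """Hypothetical stand-in: accepts or rejects a drafted section."""
    def approve(self, section: str) -> bool:
        return len(section) > 0

def generate_patent(draft: str) -> str:
    planner, writer, examiner = Planner(), Writer(), Examiner()
    sections = []
    for task in planner.plan(draft):
        section = writer.write(task)
        if examiner.approve(section):   # revision loop elided in this sketch
            sections.append(section)
    return "\n".join(sections)
```

The point of the structure is that long-document generation is decomposed into planning, per-section writing, and examination, rather than asking a single model for 17K tokens in one pass.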
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Oceania > Australia > New South Wales (0.04)
- (9 more...)
Simple stochastic processes behind Menzerath's Law
This paper revisits Menzerath's Law, also known as the Menzerath-Altmann Law, which models a relationship between the length of a linguistic construct and the average length of its constituents. Recent findings indicate that simple stochastic processes can display Menzerathian behaviour, though existing models fail to accurately reflect real-world data. If we adopt the basic principle that a word can change its length in both syllables and phonemes, where the correlation between these variables is not perfect and these changes are of a multiplicative nature, we get a bivariate log-normal distribution. The present paper shows that, from this very simple principle, we obtain the classic Altmann model of the Menzerath-Altmann Law. If we model the joint distribution separately and independently from the marginal distributions, we can obtain an even more accurate model by using a Gaussian copula. The models are confronted with empirical data, and alternative approaches are discussed.
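The multiplicative mechanism described in the abstract is easy to simulate: draw word lengths in syllables and phonemes as imperfectly correlated log-normals and check that mean phonemes per syllable decreases with syllable count. All parameter values below are illustrative choices, not estimates from any corpus:

```python
import numpy as np

rng = np.random.default_rng(1)

# Log-scale means, standard deviations, and an imperfect correlation (rho < 1);
# these numbers are illustrative, not fitted to data.
mu = np.array([1.0, 2.0])            # mean log-syllables, mean log-phonemes
s1, s2, rho = 0.5, 0.5, 0.7
cov = [[s1 * s1, rho * s1 * s2], [rho * s1 * s2, s2 * s2]]

log_xy = rng.multivariate_normal(mu, cov, size=200_000)
syl_len, pho_len = np.exp(log_xy[:, 0]), np.exp(log_xy[:, 1])

syllables = np.maximum(1, np.round(syl_len)).astype(int)
per_syllable = pho_len / syllables   # constituent (phonemes-per-syllable) length

# Mean constituent length by construct length: Menzerathian decrease.
means = [per_syllable[syllables == k].mean() for k in range(1, 6)]
```

Because the log-scale correlation is below one, E[phonemes | syllables] grows slower than linearly, so the per-syllable average falls as words get longer — which is the qualitative content of the law.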
- Europe > Netherlands > South Holland > Dordrecht (0.05)
- Europe > Czechia > Prague (0.05)
- Europe > United Kingdom (0.04)
- (3 more...)
On the Transferability of Visually Grounded PCFGs
There has been a significant surge of interest in visually grounded grammar induction in recent times. While a variety of models have been developed for the task and have demonstrated impressive performance, they have not been evaluated on text domains that are different from the training domain, so it is unclear if the improvements brought by visual groundings are transferable. Our study aims to fill this gap and assess the degree of transferability. We start by extending VC-PCFG (short for Visually-grounded Compound PCFG~\citep{zhao-titov-2020-visually}) in such a way that it can transfer across text domains. We consider a zero-shot transfer learning setting where a model is trained on the source domain and is directly applied to target domains, without any further training. Our experimental results suggest that the benefits from using visual groundings transfer to text in a domain similar to the training domain but fail to transfer to remote domains. Further, through data and result analysis, we find that the lexicon overlap between the source domain and the target domain is the most important factor in the transferability of VC-PCFG.
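The lexicon-overlap factor identified in the analysis can be operationalized very simply, for instance as the fraction of target-domain word types also attested in the source domain (one plausible definition for illustration; the paper's exact metric may differ):

```python
def lexicon_overlap(source_tokens, target_tokens):
    """Fraction of target-domain word types also seen in the source domain."""
    source_types, target_types = set(source_tokens), set(target_tokens)
    if not target_types:
        return 0.0
    return len(source_types & target_types) / len(target_types)
```

Under a definition like this, a target domain scoring near 1.0 shares almost all of its vocabulary with the training data, while a remote domain scores much lower — matching the observation that transfer degrades as overlap shrinks.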
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- Europe > Italy > Tuscany > Florence (0.04)
- (5 more...)
Why do We use Cross-entropy in Deep Learning -- Part 2
Entropy, Cross-entropy, Binary Cross-entropy, and Categorical Cross-entropy are crucial concepts in Deep Learning, and the cross-entropy variants are among the main loss functions used to train Neural Networks. All of them derive from the same concept: Entropy, which may be familiar to you from physics and chemistry. However, few courses or articles explain these terms in depth, since doing so correctly takes some time and mathematics. In the first post, I presented three different but related conceptions of entropy and where its formula derives from. However, there is still one key concept to address, since Deep Learning does not use Entropy itself but a close relative of it called Cross-entropy.
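To make the distinction concrete: entropy scores a distribution against itself, while cross-entropy scores it against another distribution, and the two coincide exactly when the model matches the data. A minimal sketch in natural-log units (nats):

```python
import math

def entropy(p):
    """H(p) = -sum_i p_i * log p_i  (natural log; terms with p_i = 0 vanish)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log q_i; H(p, q) >= H(p), equal iff q matches p."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)

# A fair coin costs log(2) nats to encode; coding it with a biased
# model q costs strictly more — that excess is what training minimizes.
fair = [0.5, 0.5]
biased = [0.9, 0.1]
```

The gap H(p, q) - H(p) is the KL divergence, which is why minimizing cross-entropy against fixed data labels is equivalent to driving the model distribution toward the data distribution.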